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Abstract 

In this article, we discuss the optimal allocation problem in an experiment when a re- 
gression model is used for statistical analysis. Yu (Monotonic Con- vergence of a general 
algorithm for computing optimal designs. Annual of Statistics, 38, 2010: 1593-1606) has 
established the monotonic convergence for a general class of multiplicative algorithms for 
D-optimality. Here, we provide an alternate proof of the monotonic convergence for D- 
criterion with a simple computational algorithm. We also discuss an algorithm as well as 
a conjecture of the monotonic convergence for A-criterion. Monte Carlo simulations are 
used to demonstrate the reliability, efficiency and usefulness of the proposed algorithms. 
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1 Introduction 

Regression analysis is an useful technique in modeling and analyzing several variables, when 
the focus is on the relationship between a dependent variable and one or more independent 
variables. It is widely used in different fields of study. For instance, in reliability and life- 
testing experiments, often one of the primary purposes is to study the effect of covariates 
on the failure time distribution and to develop inference on the survival probability or some 
other reliability characteristic of an equipment. For this purpose, a regression model is used to 
incorporate these covariates in the statistical analysis. 
Consider a general regression model 

Y = »(x,/3)+ae (1) 

where Y is the response variable, fj,(x, j3) is a known function which depends on the unknown 
parameters (3 e ffl and the p covariates x = (x±, . . . ,x p )'; and e is a random variable with 
E(e) = and Var(e) = 1. We can rewrite the combinations of different levels in different 
covariates into k experimental conditions (or design points) represent by x\ = (xu x 2 i ... x p i), 
where xu is one of the levels of the i-th covariate. Note that when the intercept term present 
in the regression model, xu = 1, I — 1, . . . , k. 

Suppose that in an experiment, we have N items available for the test at k experimental 
conditions. We assign ni items for testing at experimental condition xi (I = 1,2, ... ,k) with 

k 

Y^ni = N , and observe the corresponding observations for estimation of parameters and/or pre- 
i=i 

diction. In planning such an experiment, we have the flexibility in the choice of (ni, 112, ■ ■ ■ , n&) 
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for a given value of N and the experimental conditions Xi,l = 1,2, ... ,k. Here, we consider 
the problem of optimal allocation of n\, n-i, . . . , for a general regression model. This optimal 
allocation problem is usually referred as optimal design problem in the literature. 

The optimal design problem in regression setting has long been studied in the literature, 
for example, Elfving (1952), Fedorov (1972), Silvey (1980). For extensive developments in op- 
timal design, one may refer to Silvey (1980), Box and Draper (1987), Seber and Wild (1989), 
Atkinson and Donev (1992), Liski et al. (2002) and a concise introduction by OBrien and 
Funk (2003). Besides the rich development in optimal design theory, different numerical com- 
putational algorithms have been proposed to obtain optimal designs under different scenarios. 
For instance, when (3) is linear functions of (3, Wynn (1970) proposed a W-algorithm and 
Fedorov (1972) proposed a V - algorithm to search for the optimal design. Following the ideas 
in Wynn (1970) and Fedorov (1972), Mitchell (1974) proposed an algorithm for the maximiza- 
tion of |X T X|, where X is the design matrix. Then, Cook and Nachtsheim (1980) provided 
an empirical comparison of existing algorithms for the computer generation of exact D-optimal 
experimental including those due to Wynn (1970), Fedorov (1972) and Mitchell (1974) and 
proposed a modification of the Fedorov (1972) algorithm. However, as pointed out by Silvey 
(1980, p. 34), these early algorithms have been criticized on the grounds of their slow conver- 
gence and some algorithms have been suggested to increase the speed of convergence (see, for 
example, Atwood, 1973, Silvey and Titterington, 1973 and Wu, 1978). Meyer and Nachtsheim 
(1995) proposed a cyclic coordinate-exchange algorithm for constructing of D-optimal designs 
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mainly for continuous design space. As they stated in Section 2.4, "For finite design spaces, 
the procedure is conceptually simple, although the computational demands can be prohibitive 
when q (dimensions of covariates) is large." Vandenberghe, Boyd and Wu (1998) have proposed 
the interior-point method to deal with more general problems but it also suffer from the slow 
convergence problem. Recent papers of Torsney and Mandal (2006), Harman and Pronzato 
(2007), Dette, Pepelyshev and Zhigljavsky (2008) and Torsney and Martin-Martin (2009) de- 
veloped numerical computational algorithms for D-optimal designs. Yu (2010) discussed the 
monotonic convergence for a general class of computational algorithms for D-optimal design. 
In general, it is desirable to have a numerical computational algorithm to obtain optimal de- 
signs which (i) is simple and reliable; (ii) can be applied in general situations; (iii) has a high 
convergence rate. 

In this paper, we aim to develop efficient computational algorithms to obtain optimal al- 
location for a general regression model subject to the D- optimality and A-optimality criteria. 
Mathematical results related to the convergence and monotonicity of the proposed algorithms 
are developed. An extensive simulation study is performed to show the reliability of these 
algorithms. In Section 2, we consider the likelihood inference based on a general regression 
model and present the forms of the expected Fisher information matrix and the asymptotic 
variance-covariance matrix. The two optimal criteria are also discussed in Section 2. Then, 
the proposed computational algorithms for D-optimality and A-optimality criteria and their 
relatedmath- ematical properties are discussed in Section 3. Finally, concluding remarks are 
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provided in Section 4. 

2 Model and Optimal Criteria 
2.1 Model and Notations 

One of the commonly used approaches to estimate the unknown parameters in a regression 
model in ([1]) is the maximum likelihood method. The maximum likelihood estimates (MLEs) 
are obtained by maximizing the likelihood function subject to the unknown parameters, (3. The 
properties of MLEs of the regression model are then evaluated based on the asymptotic theory 
of MLEs. When fi(x, f3) is a linear function of /3, the expected Fisher information matrix of 
the MLE of f3 can be expressed as a function of n\, . . . , n k , 



where n\ is the number of repeated observations or measurements under the experimental 
condition x\. Thus, the asymptotic variance- co variance matrix of the MLE of f3, which is 
the inverse of the expected Fisher information matrix, can also be expressed as a function of 
ni, ■■■,n k . 

We can also write the expected Fisher information in terms of wi = rii/N, where wi = rii/N 
is the proportion of units (of a total of iV units under test) to be assigned to experimental 
condition xi, I — 1, 2, . . . , k, 



S(m, ...,n k ) 



1 r n 
— riiX\X 1 H Yn k x k x k , 



(2) 



£(io) = ...,w k ) = N(wiAi + w 2 A 2 H h w k A k ) 



(3) 
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where A\, . . . , are known nonnegative definite matrices which are functions of Xx, . . . ,Xk 
and wi = ni/N is the proportion of units (of a total of N units under test) to be assigned 
to experimental condition Xi, I = 1,2, ...,k. The related applications that the inverse of 
covariance is decomposed into sums of linear nonnegative definite matrices has been considered 
by Vandenberghe, Boyd and Wu (1998), and Qu, Lindsay and Li (2000). The optimal allocation 

problem is equivalent to obtain the values of w = (wx,W2, ■ ■ ■ ,Wk) which optimized a specific 

k 

optimal criterion subject to the constraints Wi > (I = 1,2,..., A;) and Y2 w i = 1- ^ * s 

i=i 

noteworthy that if the expected Fisher information of the MLEs can be expressed in the form 
of ([2]) or ([3]), then the algorithms proposed in our manuscript are applicable. We can show that 
many of the commonly used regression model, for example, multiple regression model with 
normal errors, Weibull (extreme-value) regression model and Birnbaum-Saunders regression 
model, have expected Fisher information of the MLEs in the form of ([2]) or ([3]) and thus our 
proposed algorithm are applicable in those situations. 

2.2 Optimal Criteria 

The goal here is to determine the optimal planning of an experiment using regression models 
for analysis. We can determine the optimal allocation subject to different optimality criteria. 
If we are interested in the estimation of the model parameters, we may consider optimality in 
terms of: 

[CI] Maximization of the determinant of the Fisher information matrix S: This criterion is 
D- optimality, wherein the determinant of the Fisher information matrix is maximized, 
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which results in minimum volume for the Wald-type joint confidence region for the model 
parameters (J3, a), w* is a D-optimal allocation for (j3]) if and only if 

f k } 

w* = argmin < — log(|E(tr)|) : subject to wi > and = 1 > • (4) 



i=i 

[C2] Minimization of the trace of the variance-covariance matrix (S _1 ) of the MLE's: This 
criterion is A-optimality which minimizes the sum of the variances of the parameter esti- 
mates and provides an overall measure of variability from the marginal variabilities, w* 
is a v4-optimal allocation for ([3]) if and only if 

r fc ] 

w* = argmin < — log(trace(S : subject to Wi > and ^^W; = 1 > . (5) 



i=i 



3 Proposed Computational Algorithm 

In this section, we propose the computational algorithms to obtain the D-optimal and A-optimal 
choices of w. The properties of these algorithms are discussed. 

3.1 Algorithm for D-optimal allocation 

Theorem 1. w* is the D-optimal choice for ([3]) if and only if 

trace^E" 1 (w*)) = p for < ^ (6) 

and 

trace(A i S" 1 (iu*)) < p for = 0. (7) 

Proof: Let S(w) = log(|S(w)|), it is easy to check that S(w) is convex in w. By Kuhn- 
Tucker conditions (Kuhn and Tucker, 1951), w* is the optimal solution of (4) if and only if for 



all w(wj > and J2 w j = 1)> 

Q ^dS^uf) = _YtieLce(A l 'E- 1 (w*))(w l -wt) 

ow z — ' 

1=1 

k 

= - y^trace(A;S~ 1 (tt;*))w; +p, 



Z=l 



which implies fl6]) and ([7]). Note that Eq. (6) given in Theorem 1 is consistent with the General 
Equivalence Theorem (Kiefer and Wolfowitz, 1960). 

Based on ([6]) and (J7j), the following iterative algorithm to obtain the .D-optimal allocation 
in (j3J is proposed. 

Algorithm for D-optimal allocation 

1. Set the initial value of w as = (1/k, ■ ■ ■ , 1/k)' . 



2. In the h-th step, update the value of w as 



(h) (h-i) 



trace(A i S- 1 (^(' 1 - 1 ))) 



V 

for I — 1, • • • , k. Note that when Aj = Xix[, ([8]) can be expressed as 



(8) 



(h) (h-i) 



p 



3. Repeat step 2 until the algorithm converge. One of the stopping rule based on absolute 
difference is stop when max j — w^ 1 ^ | < C> where ( is a small number specified 
by the user. 

Yu (2010) has established the monotonic convergence for multiplicative algorithms for D- 
optimality. Here, we provide an alternate proof of the monotonic convergence. 



CO a „(k-i). 



1=1 



Theorem 2. Let be given by ([8]), and then 

fc 

log|£(t»W)| -loglS^- 1 ))! ^pVui^Iogi > | 

1=1 ™z ^ 

and 

„,(*) _ 0. 

r-pi f trace^S- 1 ^- 1 ))) ) (fc) , 

Ihe sequence < > converges to 1 it w y ' converges and lini/ l _j. 00 w, 

IP) 

ftracefA^- 1 ^- 1 )))! r fM 

0; the sequence < > converges to no more than 1 it w [ ' converges and 

I P J 

lim^oo = 0. Thus, by Theorem 1, it converges to the D-optimal solution of (|1J). For 

the optimal allocation (j4]), its solution may not be unique and we can choose different initial 
values and get different optimal solutions. Compared with the W-algorithm proposed by Wynn 
(1970) and the V -algorithm proposed by Fedorov (1972), the value of w in the current step in 
our proposed algorithm is an explicit function of the value in the previous step which involves 
simple matrix manipulation while the value of w in each step of the W- and V -algorithms 
involve maximizations which required numerical procedures in computation. Therefore, the 
algorithms proposed here converge quicker and they are easy to program. 

In order to study the convergent rate of the proposed algorithm, an extensive simulation 
study is performed. We generate the form of the Fisher information matrix in ([3]) with the 
elements of Xi,x 2 ,-- - ,x k being independent identically uniform distributed in between -1 
and 1, i.e., U(— 1,1), for number of design points k = 10,20,30,40 and number of covariates 
p = 4,5, 8, 10, 15, 20, 25, 30 with p < k. The stopping criteria of the algorithm are set to be 



max | — w f"~^ 1 1 < 0.0001. For each combination of p and k, 50 replications are simulated 
and their corresponding D-optimal allocations are found by using the proposed algorithm. The 
number of iterations and the elapse time (in unit of second) required to obtain the D-optimal 
allocation are recorded and their average values (with standard deviations in parenthesis) are 
presented in Table 1. 

3.2 Algorithm for A-optimal allocation 

Theorem 3. w* is the A-optimal choice for ([3]) if and only if 

trace(S- 1 (iu*)A i S" 1 (iu*)) 
trace(£ -1 (tt;*)) 

and 



1 for wi ^ (9) 



t r ace(S-' (u ,-)A,Sr'K)) < for = 

trace(S~ 1 (w*)) ' K J 

Based on (jUJ) and ffTUj) . the following iterative algorithm to obtain the A-optimal allocation 
in (jSJ) is proposed. 

Algorithm for A-optimal allocation 

1. Set the initial value of w as = (1/k, • • • , 1/k) . 

2. In the h-th step, update the value of w as 

"trace(S- 1 (tt;(' l - 1 ))A i S- 1 (i ( ;( /l - 1 ))) 



(ft-i) 



V 



+ p- 1 



trace(S" 1 (iu( /l - 1 ))) 
for I — 1, • • • , k. Note that when Aj = Xix[, (ITTj) can be expressed as 

»-i) 



; = — — 
p 



trace(S~ 1 (iu( /l - 1 ))) 
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+ p- 1 



3. Repeat step 2 until the algorithm converge. One of the stopping rule based on absolute 
difference is stop when max — Wj h ^|| < (, where ( is a small number specified 

by the user. 

Although a theoretical justification of convergence of the proposed computational algorithm 
for A-optimality similar to Theorem 2 is not yet avail- able, simulation results strongly sup- 
port the validity and reliability of the algorithm. An extensive simulation study with settings 
presented in Section 3.1 is performed to study the properties of A-optimality. We have gen- 
erated a wide range of settings and use our algorithm to compute the A-optimal allocation 
and we found that the algorithm converge in all the cases. The number of iterations and the 
elapse time (in unit of second) required to obtain the A-optimal allocation are recorded and 
their average values (with standard deviations in parenthesis) are presented in Table 2. We 
conjecture the monotonic convergence of the algorithm for A-optimality and the theoretically 
prove of convergence of the proposed computational algorithm for A-optimality seems to be an 
interesting open problem. 

4 Concluding Remarks 

We have proposed simple and efficient iterative algorithms to obtain the D-optimal and A- 
optimal allocations for general regression model. We have provided an alternate proof of the 
monotonic convergence of the proposed algorithm for D-optimality which ensure that the al- 
gorithm converges to optimal allocation. We have also shown that the proposed algorithm for 
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A-optimality converges by an simulation study and conjecture the the mono- tonic convergence 
of the proposed algorithm for A-optimality. The proposed computational algorithms converges 
fast and they are easy to program. These algorithms are programmed in R (R Development 
Core Team, 2011) and the programs are available from the authors upon request. 

Appendix: Proof of Theorem 2 

Lemma 1. 7T = (tti, ir 2 , ■ ■ ■ , n k ) and = (9i, 9 2 , . . . , 6 k ) are two probability vectors in 9ft fc , and 
then 



2^7T, log(7T,/0, 



1=1 



1/2 



>£>-0,|. (12) 



i=i 



Proof: See the proof given by Kullback (1967), Csiszar (1967) or Kemperman (1969). 

k 

Lemma 2. Let S(io, d) = ^w^Ai, and then £(iu) = E(iu, 1) and 

i=i 

k 

-log[|S(w,d)|] - -log[|S(w)|] - VVlogd, > 0. (13) 

where 

trace^ST^u?)) 
V 

Proof. For d x > 0, . . . , d k > 0, let 



V 

and then 



1 k 
g(d u ...,d k ) = -log[|E(w,d)|] -^wi logd t , 



dg(di, . . . ,d k ) tracers 1 (w,d)) w t 



=Wl a ' 

odi v di 
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d 2 g(d 1 ,...,d k ) trace( A l 1 E~ 1 (w,d)A j 1 E~ 1 w,d)) , , 
= -wiWj = , I ^ J 



dd[ddj 



V 



d 2 g{d 1 , ... ,4) 
dd 2 



-w, 



,trace(A;S 1 (w,d)Ai'E 1 (w,d)) wi 
' P + ? 



d 2 g d 2 g d 2 g 
dd\ ddidd 2 dd\dd k 



d 2 g d 2 g 



dd 2 dd 1 dd\ 



d 2 g d 2 g 



ddkddi dd k dd 2 



d 2 g 
dd 2 dd k 



d 2 g 
ddl 



> 



d=i 



and dg(l, . . . , l)/ddi = 0, and then (1, . . . , 1)' is the local minimum point. By the fact g(X x 
di, . . . , A x d k ) = g(di, . . . , d k ) for A > 0, (1, . . . , 1)' is the global minimum point and so 



g(di,...,d k ) > g(l,...,l) 



which implies that the Lemma holds. 



Lemma 3. log |E(u>W)| > log ^(w^-^)]. 

Proof: Since wf^ = wf 1 ' 1 ^ 1 | ^ race (^^ — )) ^ L emma 2 ; 



P 



log|S(toW)| -loglS^- 1 ))! >px ^^ _1) log 



i=i 



trace(A,S- 1 (io( /l - 1 ))) 



V 



> 



13 



by Lemma 1. Thus the conclusion holds. 



Lemma 4. A is a nonnegative matrix and then 



\A\ < Y[a u , 



i=i 



where the (i, j)-th element in A. 

Proof: See the results given by Anderson (2003). 



Proof of Theorem 2: By Lemma 4, 



(14) 



iE(«>w)i < n 



i=i 



E 

.1=1 



(h) 2 
a il 



< Y[max{al; I = !,-•• ,£;}, 



i=i 



that is, the sequence {log |S(to^)|} is uniformly bounded. By Lemma 3, we also know the 
sequence {log |S(tw^)|} increasing in n, and thus it converges. 
We have 



0= lim [log|£(™ w )| - log | E^- 1 )) |1 > lim pxVwf^log 



h— >oo 



(=1 



trace(A / S- 1 (^( /l - 1 ))) 



P 



> 



which implies 



= lim w\ h ^ log 



1=1 



trace^E -1 ^- 1 ))) 



V 



^ i W ( 



(h-i). 



i=i 



and ™ w - w^- 1 ) ->■ 0. 
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Table 1: Simulated results for D-optimality 



Average no. of Average elapsed 



k 


P 


iterations (s.d.) 


time in sec. (s.d.) 


10 


4 


56.8 (27.9) 


u. iyz 


(U.Uy4 ) 




5 


41.9 (16.4) 


U. 14o 


(U.UDo J 




8 


19.5 (10.6) 


U.UD i 


(U.U4U ) 


20 


4 


96.1 (69.0) 


n A Q 

U.648 


(0.4b 1 ) 




5 


77.9 (29.6) 


n CC Q A 

U.534 


(0.203 j 




8 


51.3 (11.5) 


0.370 


(0.085) 




10 


36.0 (9.5) 


U.ZD 1 


(U.U 1 6 ) 




15 


15.8 (4.7) 


u. izy 


(U.U4U ) 


30 


4 


115.1 (94.8) 


1 1Q1 


( C\ OQ9^ 

(u.yoz j 




5 


99.3 (35.9) 


i nno 

i.uyz 


(U.ooy j 




8 


60.1 (16.7) 


U.o49 


(U.179J 




10 


50.2 (14.7) 


U.ODO 


( c\ i cn\ 
[U.lOZ ) 




15 


29.0 (4.5) 


U.O04 


(U.Uo / ) 




20 


17.9 (3.9) 


0.255 


(0.054) 




25 


9.7 (2.8) 


0.155 


(0.046) 


40 


4 


124.4 (63.9) 


1.673 


(0.853) 




5 


104.7 (38.6) 


1.431 


(0.530) 




8 


72.8 (28.5) 


1.053 


(0.419) 




10 


52.2 (10.2) 


0.791 


(0.153) 




15 


36.7 (6.2) 


0.608 


(0.104) 




20 


26.0 (6.0) 


0.497 


(0.119) 




25 


16.7 (3.5) 


0.362 


(0.077) 




30 


10.7 (2.1) 


0.273 


(0.055) 
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Table 2: Simulated results for A-optimality 







Average no. of 


Average 


elapsed 


k 


P 


iterations (s.d.) 


time in sec. (s.d.) 


10 


4 


126.2 (77.0) 


1 (117 
1.U1 1 


(c\ ci o\ 




5 


121.5 (67.3) 


U.yy4 


(U.004 ) 




8 


67.6 (20.7) 


n c:ori 

U.ooU 


(U. loU J 


20 


4 


188.0 (79.3) 


o no o 

3.033 


(1.283) 




5 


169.0 (73.9) 


2.784 


(1.218) 




8 


119.5 (30.0) 


2.062 


(0.514) 




10 


97.5 (18.0) 




/fl Q1 C\ 
(U.olo ) 




15 


72.6 (20.2) 


1.4Z 1 


(U.oyy ) 


30 


4 


218.3 (84.6) 


K OQ A 


(Z.U40 ) 




5 


200.0 (68.0) 


A OOQ 

4.933 


(1.682 j 




8 


155.8 (38.6) 


4.039 


(1.004 J 




10 


128.7 (30.0) 


9. AAA 
0.444 


(u.ouy j 




15 
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